Goto

Collaborating Authors

 belief net


Belief Net: A Filter-Based Framework for Learning Hidden Markov Models from Observations

Chen, Reginald Zhiyan, Chang, Heng-Sheng, Mehta, Prashant G.

arXiv.org Artificial Intelligence

Hidden Markov Models (HMMs) are fundamental for modeling sequential data, yet learning their parameters from observations remains challenging. Classical methods like the Baum-Welch (EM) algorithm are computationally intensive and prone to local optima, while modern spectral algorithms offer provable guarantees but may produce probability outputs outside valid ranges. This work introduces Belief Net, a novel framework that learns HMM parameters through gradient-based optimization by formulating the HMM's forward filter as a structured neural network. Unlike black-box Transformer models, Belief Net's learnable weights are explicitly the logits of the initial distribution, transition matrix, and emission matrix, ensuring full interpretability. The model processes observation sequences using a decoder-only architecture and is trained end-to-end with standard autoregressive next-observation prediction loss. On synthetic HMM data, Belief Net achieves superior convergence speed compared to Baum-Welch, successfully recovering parameters in both undercomplete and overcomplete settings where spectral methods fail. Comparisons with Transformer-based models are also presented on real-world language data.


Parallelizing Probabilistic Inference: Some Early Explorations

D'Ambrosio, Bruce, Fountain, Tony, Li, Zhaoyu

arXiv.org Artificial Intelligence

We report on an experimental investigation into opportunities for parallelism in beliefnet inference. Specifically, we report on a study performed of the available parallelism, on hypercube style machines, of a set of randomly generated belief nets, using factoring (SPI) style inference algorithms. Our results indicate that substantial speedup is available, but that it is available only through parallelization of individual conformal product operations, and depends critically on finding an appropriate factoring. We find negligible opportunity for parallelism at the topological, or clustering tree, level.


Incremental Probabilistic Inference

D'Ambrosio, Bruce

arXiv.org Artificial Intelligence

Propositional representation services such as truth maintenance systems offer powerful support for incremental, interleaved, problem-model construction and evaluation. Probabilistic inference systems, in contrast, have lagged behind in supporting this incrementality typically demanded by problem solvers. The problem, we argue, is that the basic task of probabilistic inference is typically formulated at too large a grain-size. We show how a system built around a smaller grain-size inference task can have the desired incrementality and serve as the basis for a low-level (propositional) probabilistic representation service.


Bayesian Error-Bars for Belief Net Inference

Van Allen, Tim, Greiner, Russell, Hooper, Peter

arXiv.org Artificial Intelligence

A Bayesian Belief Network (BN) is a model of a joint distribution over a setof n variables, with a DAG structure to represent the immediate dependenciesbetween the variables, and a set of parameters (aka CPTables) to represent thelocal conditional probabilities of a node, given each assignment to itsparents. In many situations, these parameters are themselves random variables - this may reflect the uncertainty of the domain expert, or may come from atraining sample used to estimate the parameter values. The distribution overthese "CPtable variables" induces a distribution over the response the BNwill return to any "What is Pr(H | E)?" query. This paper investigates thevariance of this response, showing first that it is asymptotically normal,then providing its mean and asymptotical variance. We then present aneffective general algorithm for computing this variance, which has the samecomplexity as simply computing the (mean value of) the response itself - ie,O(n 2^w), where n is the number of variables and w is the effective treewidth. Finally, we provide empirical evidence that this algorithm, whichincorporates assumptions and approximations, works effectively in practice,given only small samples.